Miscommunication handling in spoken dialog systems based on error-aware dialog state detection
Authors
Abstract
With the exponential growth in computing power and progress in speech recognition technology, spoken dialog systems (SDSs), with which a user interacts through natural speech, have been widely used in human-computer interaction. However, error-prone automatic speech recognition (ASR) results often lead to inappropriate semantic interpretation, so miscommunication happens easily. This paper presents an approach to error-aware dialog state (DS) detection for robust miscommunication handling in an SDS. Two kinds of miscommunication are considered in this study: non-understanding (Non-U) and misunderstanding (Mis-U). First, understanding evidence (UE), derived from the recognition confidence, is adopted for Non-U detection, followed by Non-U recovery. For Mis-U, where the recognized sentence contains uncertain words, the partial sentences obtained by removing potentially misrecognized words from the input utterance are organized, based on regular expressions, into a tree structure for Mis-U DS modeling that tolerates the deletion or rejection of keywords caused by misrecognition. Latent semantic analysis is then employed to consider the verified words and their n-grams for DS detection, covering both Mis-U and predefined Base DSs. Historical information-based n-grams are employed to find the most likely DS for the SDS. Several experiments were performed with a dialog corpus for the restaurant reservation task. The experimental results show that the proposed approach achieved promising performance for Non-U recovery and Mis-U repair, as well as a satisfactory task success rate for the dialogs.
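The two detection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds, and the use of mean word confidence as a stand-in for understanding evidence are all assumptions made for illustration, and the regular-expression tree is simplified to a flat list of partial sentences.

```python
from itertools import combinations


def is_non_understanding(confidences, threshold=0.4):
    """Flag Non-U when mean word confidence is below a threshold.

    Assumption: the mean is a stand-in for the paper's understanding
    evidence (UE); the threshold value is hypothetical.
    """
    if not confidences:
        return True
    return sum(confidences) / len(confidences) < threshold


def partial_sentences(words, confidences, uncertain_below=0.5):
    """List the sentences left after removing each non-empty subset
    of uncertain (low-confidence) words.

    Assumption: a word is "potentially misrecognized" when its
    confidence is below `uncertain_below` (hypothetical cutoff).
    """
    uncertain = [i for i, c in enumerate(confidences) if c < uncertain_below]
    results = []
    for k in range(1, len(uncertain) + 1):
        for removed in combinations(uncertain, k):
            kept = [w for i, w in enumerate(words) if i not in removed]
            results.append(" ".join(kept))
    return results


if __name__ == "__main__":
    words = ["book", "a", "table", "for", "two"]
    confidences = [0.9, 0.8, 0.3, 0.9, 0.45]
    print(is_non_understanding(confidences))
    for sentence in partial_sentences(words, confidences):
        print(sentence)
```

In this sketch, every partial sentence is a candidate input for the later DS detection step, so a misrecognized keyword can be dropped without discarding the whole utterance.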
Similar resources
Error Handling in the RavenClaw Dialog Management Framework
We describe the error handling architecture underlying the RavenClaw dialog management framework. The architecture provides a robust basis for current and future research in error detection and recovery. Several objectives were pursued in its development: task-independence, ease-of-use, adaptability and scalability. We describe the key aspects of architectural design which confer these propertie...
Error Handling in the RavenClaw Dialog Management Architecture
We describe the error handling architecture underlying the RavenClaw dialog management framework. The architecture provides a robust basis for current and future research in error detection and recovery. Several objectives were pursued in its development: task-independence, ease-of-use, adaptability and scalability. We describe the key aspects of architectural design which confer these properti...
On the Utility of Decision-Theoretic Hidden Subdialog
A spoken dialog system typically characterizes a domain task with multiple states interconnected by actions or thresholds as transitions between states. As the system attempts to solicit a piece of information from the user, it may have to engage in a hidden subdialog, or error handling within a particular state, before transitioning to a new state. Hidden subdialogs generally center on illocut...
Pragmatic Issues in Handling Miscommunication: Observations of a Spoken Natural Language Dialog System
As with human-human interaction, human-computer dialog will contain situations where there is miscommunication. This paper describes phenomena observed in the handling of miscommunication by an experimental spoken natural language dialog system capable of variable initiative behavior, the Circuit Fix-It Shop. In general, the 141 dialogs obtained from human interaction with this system indicate...
Error-correction detection and response generation in a spoken dialogue system
Speech understanding errors in spoken dialogue systems can be frustrating for users and difficult to recover from in a mixed-initiative spoken dialogue system. Handling such errors requires both detecting error conditions and adjusting the response generation strategy accordingly. In this paper, we show that different response wording choices tend to be associated with different user behaviors ...
Journal title:
- EURASIP J. Audio, Speech and Music Processing
Volume 2017, Issue
Pages -
Publication date: 2017